Estimating small frequency moments of data stream: a characteristic function approach
Abstract
We consider the problem of estimating the first moment of a data stream, defined as F_1 = ∑_{i ∈ {1,2,...,n}} |f_i|, to within 1 ± ε relative error with high probability. Several algorithms are well known for this problem, including the median estimator over p-stable sketches by Indyk [11], the geometric means estimator over p-stable sketches by Li [13], and the Hss sketch based algorithm in [8]. The current best algorithm, given by Kane, Nelson and Woodruff in [12], uses space O(ε^{-2} log(mM)) and is proved to be space-optimal. In this paper, we present a novel, space-optimal algorithm for estimating F_p with an elementary analysis that is based on the characteristic function of stable distributions.
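As a rough illustration of the p-stable sketch idea mentioned in the abstract, the following is a minimal sketch of the median estimator for p = 1 using Cauchy (1-stable) variables. It is not the new algorithm this paper presents. The function name `cauchy_sketch_f1`, its parameters, and the zero-based item indexing are assumptions made for the example, and the random projection matrix is stored explicitly for clarity, which a space-efficient streaming implementation would not do.

```python
import math
import random
import statistics


def cauchy_sketch_f1(updates, n, k, seed=0):
    """Estimate F_1 = sum_i |f_i| from a stream of (item, delta) updates.

    Hypothetical helper for illustration: items are integers in range(n),
    and k is the number of independent sketch counters.
    """
    rng = random.Random(seed)
    # Standard Cauchy samples via the inverse CDF: tan(pi * (u - 1/2)).
    proj = [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
            for _ in range(k)]
    sketch = [0.0] * k
    for item, delta in updates:
        # Each counter is a linear function of the frequency vector,
        # so insertions and deletions are handled the same way.
        for j in range(k):
            sketch[j] += delta * proj[j][item]
    # By 1-stability, each counter is distributed as F_1 times a standard
    # Cauchy variable, and the median of |Cauchy| is 1.
    return statistics.median(abs(s) for s in sketch)


# Example stream: 10 items inserted 10 times each, then two deletions.
updates = [(i % 10, 1) for i in range(100)] + [(3, -4), (7, -25)]
estimate = cauchy_sketch_f1(updates, n=10, k=1000, seed=1)
```

In this example the true value is F_1 = 101 (eight items with frequency 10, one with frequency 6, and one with frequency -15). The estimate concentrates around that value as k grows, with error shrinking roughly like 1/sqrt(k).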
Similar resources
Estimating Frequency Moments of Streams
We will develop algorithms that can approximate Fk by making one pass over the stream and using a small amount of memory, o(n+m). Frequency moments have a number of applications. F0 represents the number of distinct elements in the stream (which the FM-sketch from last class estimates using O(log n) space). F1 is the number of elements in the stream, m. F2 is used in database optimization engines t...
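For reference, the quantities named in this excerpt can be computed exactly when memory is not a concern. The snippet below is a minimal baseline, assuming the standard definition F_k = ∑_i f_i^k with F_0 taken as the number of distinct items. The function name `frequency_moment` is hypothetical, and the code stores every frequency, so it is not a small-space streaming algorithm.

```python
from collections import Counter


def frequency_moment(stream, k):
    """Exact F_k = sum_i f_i^k, where f_i is the count of item i."""
    freq = Counter(stream)
    if k == 0:
        # F_0 counts distinct items (items with nonzero frequency).
        return len(freq)
    return sum(f ** k for f in freq.values())


stream = [3, 1, 3, 2, 3, 1]
f0 = frequency_moment(stream, 0)  # 3 distinct elements
f1 = frequency_moment(stream, 1)  # 6, the stream length m
f2 = frequency_moment(stream, 2)  # 3^2 + 2^2 + 1^2 = 14
```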
Instructor: Chandra Chekuri. Scribe: Chandra Chekuri. 1 Estimating Frequency Moments in Streams
A significant fraction of the streaming literature is on the problem of estimating frequency moments. Let σ = a1, a2, . . . , am be a stream of numbers where, for each i, ai is an integer between 1 and n. We will try to stick to the notation of using m for the length of the stream and n for the range of the integers. Let fi be the number of occurrences (or frequency) of integer i in the stream. We let f ...
Measuring technological gap ratio of wheat production using StoNED approach to metafrontier
The aim of this paper is to use the concept of the metafrontier function to study the determination of efficiency differentials and Technological Gap Ratio (TGR) on wheat production in Khorasan Razavi province. In this study, we used the metafrontier function and group frontier based on the concept of Stochastic Nonparametric Envelopment of Data analysis (StoNED). The data used in this stud...
Better Bounds for Frequency Moments in Random-Order Streams
Estimating frequency moments of data streams is a very well studied problem [1–3,9,12] and tight bounds are known on the amount of space that is necessary and sufficient when the stream is adversarially ordered. Recently, motivated by various practical considerations and applications in learning and statistics, there has been growing interest into studying streams that are randomly ordered [3,4...
Journal title:
- CoRR
Volume abs/1005.1122
Publication date 2010